---
title: GDPR & Compliance Module
type: concept
tags: [gdpr, compliance, coppa, privacy, data-protection, legal]
sources:
  - (D) Sig Dug instruction 2026-04-18
  - (E) Code audit 2026-04-18
authority: Single source of truth for all GDPR/compliance behavior. All services must reference this page. No scattered GDPR logic is authoritative.
---

# GDPR & Compliance Module

> **⚠️ Non-Negotiable (Sig, 2026-04-18):** GDPR must be enforced by the system, not described in text. Every rule here must map to a concrete API endpoint, DB write, scheduled job, or audit log entry. If it cannot be enforced, it is not in scope.

---

## 1. Data Ownership Model

### Controller vs Processor

| Role | Party | Scope |
|---|---|---|
| **Data Controller** | School (or parent for B2C — out of scope v1) | Determines purpose and means of processing children's data |
| **Data Processor** | Pickatale | Processes data on behalf of the controller, per DPA |
| **Sub-processor** | OpenAI (GPT-4o), Azure TTS, Replicate | Pickatale must ensure sub-processor DPAs are in place |

**What this means in practice:**
- Pickatale **cannot** use a child's reading data for any purpose outside the school's instruction
- Pickatale **cannot** train models on personal child data without explicit controller consent
- Schools are responsible for obtaining parental consent through their own processes (letters home, school policy) — Pickatale does not gate student activation on parental consent in v1
- Pickatale is responsible for enforcing retention, deletion, and access rights once data is in the system

### How This Changes Per Tenant

| Tenant Type | Controller | DPA Required |
|---|---|---|
| UK/EU school | School (head teacher or DPO) | Yes — captured in Teacher Portal setup wizard |
| Government contract | Education authority | Yes — ingested via `external_entitlements` API with DPA reference |
| Individual teacher (free tier) | Teacher personally | Simplified consent notice at signup |

**DPA storage:** `school_dpa_agreements` table, with columns `school_id`, `accepted_by` (user_id), `accepted_at`, `dpa_version`, `ip_address`, and optional `user_agent`. Written at setup wizard completion.

```sql
CREATE TABLE school_dpa_agreements (
  id           VARCHAR(36)  NOT NULL PRIMARY KEY,
  school_id    VARCHAR(36)  NOT NULL,
  accepted_by  VARCHAR(36)  NOT NULL,   -- user_id of school_admin or teacher
  accepted_at  DATETIME     NOT NULL,
  dpa_version  VARCHAR(20)  NOT NULL,   -- e.g. '2026-04-01'
  ip_address   VARCHAR(45)  NOT NULL,
  user_agent   TEXT         NULL,
  INDEX idx_school_id (school_id)
);
```

---

## 2. Lawful Basis

### Basis Per Data Type — v1 School-First Model

> **⚠️ Non-Negotiable (Sig, 2026-04-18):** Lawful basis for all student data in v1 is the school contract / DPA. No per-student parental consent gate in the platform. Schools manage parental consent outside the system.

| Data Category | Lawful Basis | Who Captures | Where Stored |
|---|---|---|---|
| Child reading activity (telemetry, progress) | **School contract / DPA** | School admin at setup wizard | `school_dpa_agreements` |
| Fluency audio recordings | **School contract / DPA** + data minimisation (deleted immediately after transcription) | N/A — no separate consent gate | `school_dpa_agreements` covers; raw audio never persisted |
| Finger-tracking / touch telemetry | **School contract / DPA** + data minimisation (anonymised after session) | N/A — no separate consent gate | `school_dpa_agreements` covers; raw coords not retained |
| Parent email / contact data | **Legitimate interest** (parent progress access, invited by teacher) | Parent at invite acceptance | `users.consent_at` |
| Analytics / aggregated data | **Legitimate interest** (product improvement, no PII) | N/A — anonymised before use | — |

### Consent Capture

All explicit consent events write to `consent_records`:

```sql
CREATE TABLE consent_records (
  id            VARCHAR(36)   NOT NULL PRIMARY KEY,
  subject_id    VARCHAR(36)   NOT NULL,   -- student_id or user_id
  subject_type  ENUM('student','user') NOT NULL,
  consent_type  ENUM('audio_recording','touch_telemetry','marketing','research') NOT NULL,
  granted       BOOLEAN       NOT NULL,
  granted_by    VARCHAR(36)   NOT NULL,   -- user_id of parent or guardian
  granted_at    DATETIME      NOT NULL,
  withdrawn_at  DATETIME      NULL,       -- set on withdrawal, record kept for audit
  ip_address    VARCHAR(45)   NOT NULL,
  source        VARCHAR(100)  NOT NULL,   -- e.g. 'parent_portal_invite', 'teacher_bulk_import'
  INDEX idx_subject (subject_id, consent_type),
  INDEX idx_granted_by (granted_by)
);
```

**Audit rule:** Consent records are **never deleted** — only `withdrawn_at` is set. The audit trail of what was consented to and when must be permanently preserved.

---

## 3. Data Subject Flows

All flows write to `audit_log`. All flows send a notification. All flows are executable via API.

### 3.1 Data Export (Right of Access)

**Trigger:** Parent or teacher submits export request via portal, or `POST /api/v1/gdpr/export` is called.

**API:** `POST /api/v1/gdpr/export`
```json
{ "subject_id": "<student_id>", "requested_by": "<user_id>", "reason": "parent_request" }
```

**Steps:**
1. Write `gdpr_requests` row: `{ type: 'export', subject_id, requested_by, status: 'pending', created_at }`
2. Write `audit_log`: `{ action: 'gdpr_export_requested', subject_id, requested_by, timestamp }`
3. Async job (`gdpr_export_worker`) runs within 30 days (target: 72h):
   - Collect from: Account Center (profile), Reader App (sessions, miles, progress), Telemetry (events), Learner Bot (memories, reports), LRS (xAPI statements), Teacher Portal (class membership)
   - Assemble into structured JSON
   - Encrypt and upload to secure storage
   - Generate signed URL (24h expiry)
4. Update `gdpr_requests.status = 'complete'`, write `completed_at`, store `export_url`
5. Send `DATA_EXPORT_READY` notification to `requested_by`
6. Write `audit_log`: `{ action: 'gdpr_export_complete', subject_id, requested_by, timestamp }`

**SLA:** 30 days (GDPR Article 12(3) sets a one-month limit for responding to Article 15 access requests). System targets 72h. Alert if not completed within 7 days.
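The thresholds above (72h target, 7-day alert, 30-day SLA) can be expressed as a pure check that a monitoring job might run over open `gdpr_requests` rows. This is a minimal sketch rather than the platform's implementation: the `GdprRequest` shape, the `SlaStatus` labels, and the `slaStatus` function are hypothetical names chosen for illustration; only the three thresholds come from this page.

```typescript
// Hypothetical shape: a subset of the gdpr_requests fields named in the steps above.
interface GdprRequest {
  type: string;
  status: 'pending' | 'complete';
  created_at: Date;
}

// Hypothetical labels for where a request sits relative to the thresholds.
type SlaStatus = 'complete' | 'on_track' | 'past_target' | 'alert' | 'breached';

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Classifies a request against the 72h target, 7-day alert, and 30-day SLA.
function slaStatus(req: GdprRequest, now: Date): SlaStatus {
  if (req.status === 'complete') return 'complete';
  const ageMs = now.getTime() - req.created_at.getTime();
  if (ageMs > 30 * DAY_MS) return 'breached';
  if (ageMs > 7 * DAY_MS) return 'alert';
  if (ageMs > 72 * HOUR_MS) return 'past_target';
  return 'on_track';
}
```

Keeping the check free of database access means the same function could serve both the export and the deletion flow, since both share the 7-day alert and 30-day limit.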

---

### 3.2 Data Deletion (Right to Erasure)

**Trigger:** `POST /api/v1/gdpr/delete`
```json
{ "subject_id": "<student_id>", "requested_by": "<user_id>", "reason": "erasure_request" }
```

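**Steps (executed in order, all within a single tracked job).** The request is first recorded as a `gdpr_requests` row and logged as `gdpr_deletion_requested` (see Section 7), mirroring steps 1 and 2 of the export flow: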

| Step | Action | Service |
|---|---|---|
| 1 | `students.state = 'deleted'`, `deleted_at = NOW()` | Teacher Portal DB |
| 2 | DELETE all telemetry events for student | Telemetry DB |
| 3 | DELETE all Learner Bot memories, reports, vocab_gaps | Learner Bot DB |
| 4 | DELETE all reading sessions and page events | Reader App DB |
| 5 | DELETE LRS xAPI statements for subject | LRS |
| 6 | Anonymise `users` row: name → 'Deleted User', email → `{uuid}@deleted.invalid`, phone → NULL | Account Center |
| 7 | **RETAIN** `audit_log` rows but strip PII from metadata fields | Account Center |
| 8 | **RETAIN** `consent_records` rows (audit trail of what was consented) | Account Center |
| 9 | **RETAIN** `school_dpa_agreements` (controller accountability) | Account Center |
| 10 | Update `gdpr_requests.status = 'complete'` | Account Center |
| 11 | Send `DATA_DELETION_COMPLETE` notification | Notification service |
| 12 | Write `audit_log`: `gdpr_deletion_complete` | Account Center |

**SLA:** 30 days (GDPR Article 12(3) one-month response limit, applied to Article 17 erasure requests). Alert if not completed within 7 days.

**What an erasure request does not delete:**
- `audit_log` rows (PII stripped at deletion; rows kept until the 7-year retention period ends)
- `consent_records` (withdrawal recorded, record retained permanently)
- `school_dpa_agreements` (controller accountability, retained permanently)
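The ordered step table above could be driven by a small runner that executes steps one at a time and records how far it got. The sketch below is illustrative only: `DeletionStep`, `DeletionResult`, and `runDeletionSteps` are hypothetical names, steps are modelled as synchronous callbacks to keep the example short (real steps would be asynchronous database calls), and stopping at the first failure is an assumption, since this page does not specify failure handling.

```typescript
// Hypothetical abstraction: each entry wraps one row of the step table.
interface DeletionStep {
  name: string;
  run: (subjectId: string) => void;
}

interface DeletionResult {
  completed: string[];      // names of steps that finished, in order
  failedAt: string | null;  // first step that threw, if any
  error: string | null;     // message from that failure
}

// Runs steps strictly in order and stops at the first failure (assumed policy),
// so later steps such as marking the request complete never run after an error.
function runDeletionSteps(subjectId: string, steps: DeletionStep[]): DeletionResult {
  const completed: string[] = [];
  for (const step of steps) {
    try {
      step.run(subjectId);
      completed.push(step.name);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { completed, failedAt: step.name, error: message };
    }
  }
  return { completed, failedAt: null, error: null };
}
```

Returning the list of completed steps gives the tracked job enough information to resume or to report exactly which service still holds data.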

---

### 3.3 Consent Withdrawal

**Trigger:** Parent withdraws consent via parent portal or `POST /api/v1/gdpr/consent/withdraw`
```json
{ "subject_id": "<student_id>", "consent_type": "audio_recording", "withdrawn_by": "<user_id>" }
```

**Steps:**
1. Set `consent_records.withdrawn_at = NOW()` for matching rows
2. **Immediately stop** collection of that data type (enforced via `checkConsent()` middleware)
3. Delete any already-collected data of that type (e.g. if audio: purge all recordings for subject)
4. Write `audit_log`: `{ action: 'consent_withdrawn', subject_id, consent_type, withdrawn_by, timestamp }`
5. Notify teacher: `CONSENT_WITHDRAWN` event
6. Update `gdpr_requests` if withdrawal was requested via a formal flow

**`checkConsent()` middleware:**
```typescript
async function checkConsent(studentId: string, consentType: ConsentType): Promise<boolean> {
  const record = await db.consent_records.findFirst({
    where: {
      subject_id: studentId,
      consent_type: consentType,
      granted: true,
      withdrawn_at: null
    }
  });
  return !!record;
}
// Called before: audio recording, finger-tracking telemetry collection
// If false → skip collection silently; do not error (child UX unaffected)
```
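To illustrate how steps 1 and 2 interact, the following sketch models withdrawal against an in-memory array instead of the database. `ConsentRecord`, `withdrawConsent`, and `hasActiveConsent` are hypothetical names; the field names and the four consent types are taken from the `consent_records` schema. It is meant to show that a withdrawn row stays in place while the consent check starts returning false.

```typescript
type ConsentType = 'audio_recording' | 'touch_telemetry' | 'marketing' | 'research';

// Subset of consent_records columns needed for the sketch.
interface ConsentRecord {
  subject_id: string;
  consent_type: ConsentType;
  granted: boolean;
  withdrawn_at: Date | null;
}

function isActive(r: ConsentRecord, subjectId: string, type: ConsentType): boolean {
  return r.subject_id === subjectId && r.consent_type === type
    && r.granted && r.withdrawn_at === null;
}

// Marks matching active rows as withdrawn. Rows are kept for audit, never removed.
// Returns the number of rows updated.
function withdrawConsent(records: ConsentRecord[], subjectId: string, type: ConsentType, now: Date): number {
  let updated = 0;
  for (const r of records) {
    if (isActive(r, subjectId, type)) {
      r.withdrawn_at = now;
      updated++;
    }
  }
  return updated;
}

// In-memory equivalent of the checkConsent() lookup.
function hasActiveConsent(records: ConsentRecord[], subjectId: string, type: ConsentType): boolean {
  return records.some(r => isActive(r, subjectId, type));
}
```

Because withdrawal only sets `withdrawn_at`, a repeated withdrawal call is a no-op, which keeps the endpoint safe to retry.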

---

### 3.4 Parent Access Request

**Trigger:** Parent requests to view their child's data via parent portal.

**API:** `GET /api/v1/parent/child/:student_id/data-summary`

**Permission check:** `parent_links.parent_id = requesting_user_id AND parent_links.student_id = student_id`

**Returns:**
- Reading progress (books read, miles, FK level)
- Vocabulary gaps (anonymised word list, no session context)
- Active consents (what data is being collected and since when)
- Learner Bot latest report (same as teacher digest, parent-friendly format)

**What parents cannot access:**
- Raw telemetry event stream (too granular, not required by GDPR)
- Other children's data
- Teacher notes or class-level analytics

**Audit log:** `{ action: 'parent_data_access', subject_id, requested_by, timestamp }` on every call.
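The permission check and audit entry for this endpoint can be sketched as two small functions. The names `ParentLink`, `canAccessChild`, and `parentAccessAuditEntry` are hypothetical, and the in-memory array stands in for the `parent_links` table; only the matching rule and the audit fields are taken from this section.

```typescript
// Hypothetical row shape for the parent_links table referenced above.
interface ParentLink {
  parent_id: string;
  student_id: string;
}

// True only if a link row ties this exact parent to this exact student.
function canAccessChild(links: ParentLink[], requestingUserId: string, studentId: string): boolean {
  return links.some(l => l.parent_id === requestingUserId && l.student_id === studentId);
}

// Builds the audit_log entry required on every call to the endpoint.
function parentAccessAuditEntry(studentId: string, requestedBy: string, now: Date) {
  return {
    action: 'parent_data_access',
    subject_id: studentId,
    requested_by: requestedBy,
    timestamp: now.toISOString(),
  };
}
```

Matching on both columns in the same row is what prevents a parent linked to one child from reading a different child's summary.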

---

## 4. Data Retention Engine

Not a table — enforced by scheduled jobs.

### Retention Rules

| Data | Retention | Enforcement |
|---|---|---|
| Telemetry events | 90 days | Cron job: daily purge |
| Raw audio (fluency) | 0 days | Deleted immediately after transcription in same request |
| Raw touch coordinates | Session only | Anonymised to aggregate stats at session end, raw data deleted |
| Reading sessions, Bot memories | 3 years | Cron job: annual scan |
| LRS xAPI statements | Indefinite | Manual deletion only (GDPR request) |
| Audit logs | 7 years | Cron job: annual scan; PII stripped after user deletion |
| `consent_records` | Permanent | Never auto-deleted |
| `school_dpa_agreements` | Permanent | Never auto-deleted |
| `gdpr_requests` | 7 years | Cron job: annual scan |

### Enforcement Jobs

```
03:00 UTC daily   — Telemetry: DELETE events WHERE created_at < NOW() - INTERVAL 90 DAY
02:00 UTC Sunday  — Reader: DELETE sessions WHERE created_at < NOW() - INTERVAL 3 YEAR
                           Learner Bot: DELETE memories WHERE created_at < NOW() - INTERVAL 3 YEAR
02:30 UTC daily   — Analytics: anonymise raw touch data older than 24h
01:00 UTC annual  — Audit: strip PII from audit_log WHERE user deleted AND created_at < NOW() - INTERVAL 7 YEAR
```

Every job writes result to `retention_job_log`:
```sql
CREATE TABLE retention_job_log (
  id           VARCHAR(36) NOT NULL PRIMARY KEY,
  job_name     VARCHAR(100) NOT NULL,
  ran_at       DATETIME NOT NULL,
  rows_deleted INT NOT NULL DEFAULT 0,
  rows_anonymised INT NOT NULL DEFAULT 0,
  error        TEXT NULL,
  duration_ms  INT NOT NULL
);
```

**Alert trigger:** Any retention job that deletes 0 rows when >0 were expected, or any job that errors, fires an internal alert to platform_admin.
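As a rough illustration of the retention cutoff and the alert rule, the sketch below computes a cutoff date and decides whether a job result should raise an alert. All names are hypothetical. Two assumptions are made that this page does not state: `expectedRows` comes from a count taken before the job runs, and anonymised rows count as processed so that anonymisation jobs do not alert merely because they delete nothing.

```typescript
const DAY_IN_MS = 24 * 60 * 60 * 1000;

// Rows created before the returned instant are past their retention period.
function retentionCutoff(now: Date, retentionDays: number): Date {
  return new Date(now.getTime() - retentionDays * DAY_IN_MS);
}

// Mirrors the retention_job_log columns that matter for alerting.
interface RetentionJobResult {
  job_name: string;
  rows_deleted: number;
  rows_anonymised: number;
  error: string | null;
}

// Alert when the job errored, or when rows were expected but none were processed.
// expectedRows is an assumed input, e.g. a COUNT taken before the job ran.
function shouldAlert(result: RetentionJobResult, expectedRows: number): boolean {
  if (result.error !== null) return true;
  const processed = result.rows_deleted + result.rows_anonymised;
  return expectedRows > 0 && processed === 0;
}
```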

### Raw Audio — Zero Retention

Raw audio is never written to disk in a persistent location:

```typescript
// Fluency assessment — audio handling
const transcript = await transcribeAudio(audioBuffer);  // cloud API call
// audioBuffer is never written to DB or file storage
// transcript is written to fluency_assessments table
// audioBuffer goes out of scope and is GC'd
// If transcription fails → log error, do NOT retain audio for retry
```

---

## 5. Cross-Border Data Transfer

### Data Location

| Data Store | Location | Notes |
|---|---|---|
| Primary DB (MySQL shared) | Ukraine (readingtester.com server) | Ukraine has no EU adequacy decision, so SCCs are required |
| Backups | Same server | Same rules apply |
| LLM inference (GPT-4o) | OpenAI US servers | Covered by OpenAI DPA — only text sent, no PII in prompts |
| TTS (Azure Neural, pending) | EU region preferred | Request Azure EU data residency when provisioning |
| Replicate (Qwen3 TTS, current) | US | No PII sent — only text to synthesise |

### Transfer Safeguards

**Ukraine server (primary):**
- EU personal data stored in Ukraine requires Standard Contractual Clauses (SCCs) as transfer mechanism
- Pickatale must execute SCCs with any EU school controller before that school's data is processed
- **Action required (Needs Approval):** Legal review of SCC template before first EU school onboards commercially

**OpenAI sub-processor:**
- Only leveled/adapted text is sent — never student names, IDs, or identifiable data in prompts
- Prompt construction rule: `NEVER include student_id, name, or school in LLM prompt payloads`
- OpenAI DPA: https://openai.com/policies/data-processing-addendum — must be signed

**Enforcement rule in code:**
```typescript
// Before any LLM call — strip PII
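// Removes top-level identifier fields only; nested objects are passed through unchanged.
function sanitiseForLLM(payload: Record<string, unknown>): Record<string, unknown> {
  const { student_id, name, email, school_id, teacher_id, ...safe } = payload;
  return safe;  // Only content/text fields passed to LLM
}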
```
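The `sanitiseForLLM` function above inspects only the top level of the payload. If payloads can be nested, a recursive variant along the following lines may be worth considering. This is a sketch under assumptions: `deepSanitise` and `PII_KEYS` are hypothetical names, the key list simply repeats the fields destructured above, and only plain objects, arrays, and primitives are handled.

```typescript
// Same denylist as the top-level version (assumed to be the full set of identifier keys).
const PII_KEYS = ['student_id', 'name', 'email', 'school_id', 'teacher_id'];

// Recursively removes denylisted keys from plain objects and arrays.
function deepSanitise(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(item => deepSanitise(item));
  }
  if (value !== null && typeof value === 'object') {
    const source = value as { [key: string]: unknown };
    const out: { [key: string]: unknown } = {};
    for (const key of Object.keys(source)) {
      if (PII_KEYS.indexOf(key) === -1) {
        out[key] = deepSanitise(source[key]);
      }
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```

A key-based denylist cannot detect a child's name typed inside a free-text field, so it complements the prompt construction rule and does not replace it.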

---

## 6. Breach Handling

### Detection Triggers

| Trigger | Source | Severity |
|---|---|---|
| Unauthorised access to student data endpoint | API gateway — 401/403 spike | High |
| Mass data export from single IP | Rate limiter — >10 exports/hour | Critical |
| DB credential exposure (detected in logs) | Log scanner | Critical |
| Failed login spike on child accounts | Auth service — >50 failures/min | Medium |
| Deletion of audit_log rows | DB trigger — writes to `audit_tampering_log` | Critical |
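As one example of how a detection trigger from the table could be evaluated, the sketch below checks the mass-export rule (more than 10 exports per hour from a single IP) with a sliding one-hour window. `isMassExport` and its inputs are hypothetical; how export timestamps are collected per IP is not specified on this page and is assumed to be handled by the rate limiter.

```typescript
const EXPORT_WINDOW_MS = 60 * 60 * 1000;
const MAX_EXPORTS_PER_HOUR = 10;

// exportTimesMs: timestamps (ms since epoch) of export requests from one IP.
// Returns true when more than 10 of them fall within the hour ending at nowMs.
function isMassExport(exportTimesMs: number[], nowMs: number): boolean {
  const recent = exportTimesMs.filter(t => t > nowMs - EXPORT_WINDOW_MS && t <= nowMs);
  return recent.length > MAX_EXPORTS_PER_HOUR;
}
```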

### Breach Response Workflow

```
Detection
   ↓
1. IMMEDIATE (auto): Write to breach_log table — {detected_at, trigger, severity, affected_records_estimate}
2. IMMEDIATE (auto): Alert platform_admin via internal alerting
3. WITHIN 1 HOUR (manual): Platform admin assesses scope — personal data affected? Yes/No
4. IF personal data affected:
   a. WITHIN 24 HOURS: Internal incident report filed
   b. WITHIN 72 HOURS: Notify relevant supervisory authority (ICO for UK, national data protection authority for EU territory)
   c. IF high risk to individuals: Notify affected data subjects (schools/parents) directly
5. Post-breach: Root cause analysis → remediation → `breach_log.resolved_at` set
```

```sql
CREATE TABLE breach_log (
  id                       VARCHAR(36)  NOT NULL PRIMARY KEY,
  detected_at              DATETIME     NOT NULL,
  `trigger`                VARCHAR(255) NOT NULL,   -- backticks required: TRIGGER is a reserved word in MySQL
  severity                 ENUM('low','medium','high','critical') NOT NULL,
  affected_records_estimate INT         NULL,
  personal_data_affected   BOOLEAN      NULL,   -- set by admin assessment
  notified_authority       BOOLEAN      NOT NULL DEFAULT FALSE,
  notified_subjects        BOOLEAN      NOT NULL DEFAULT FALSE,
  authority_notified_at    DATETIME     NULL,
  resolved_at              DATETIME     NULL,
  resolution_notes         TEXT         NULL,
  created_by               VARCHAR(36)  NULL,   -- NULL if auto-detected
  INDEX idx_detected_at (detected_at)
);
```

**72-hour rule:** If `personal_data_affected = TRUE` and `authority_notified_at` is still NULL after 72 hours from `detected_at` → escalation alert to platform_admin. The system cannot file the notification, but it can ensure it is not forgotten.
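The 72-hour rule reduces to a predicate over a `breach_log` row, which a scheduled check could evaluate. The sketch below is illustrative: `BreachLogRow` lists only the three columns the rule depends on, and `needsEscalation` is a hypothetical name.

```typescript
// Subset of breach_log columns used by the 72-hour rule.
interface BreachLogRow {
  detected_at: Date;
  personal_data_affected: boolean | null;  // NULL until the admin assessment is done
  authority_notified_at: Date | null;
}

const NOTIFY_WINDOW_MS = 72 * 60 * 60 * 1000;

// True when personal data is confirmed affected, no notification has been
// recorded, and more than 72 hours have passed since detection.
function needsEscalation(row: BreachLogRow, now: Date): boolean {
  return row.personal_data_affected === true
    && row.authority_notified_at === null
    && now.getTime() - row.detected_at.getTime() > NOTIFY_WINDOW_MS;
}
```

Note that a row whose assessment is still pending (`personal_data_affected` is NULL) does not escalate under this rule as written, so the 1-hour assessment step needs its own reminder.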

---

## 7. Audit Layer

Every GDPR action writes to `audit_log`. No exceptions.

### Required Audit Entries

| Action | `audit_log.action` value |
|---|---|
| DPA accepted | `dpa_accepted` |
| Consent granted | `consent_granted` |
| Consent withdrawn | `consent_withdrawn` |
| Export requested | `gdpr_export_requested` |
| Export completed | `gdpr_export_complete` |
| Deletion requested | `gdpr_deletion_requested` |
| Deletion completed | `gdpr_deletion_complete` |
| Parent data access | `parent_data_access` |
| Entitlement fallback | `entitlement_fallback` |
| Admin impersonation | `admin_impersonation` |
| Breach detected | `breach_detected` |
| Authority notified | `authority_notified` |

### `audit_log` Schema

```sql
CREATE TABLE audit_log (
  id           VARCHAR(36)   NOT NULL PRIMARY KEY,
  action       VARCHAR(100)  NOT NULL,
  user_id      VARCHAR(36)   NULL,       -- actor (NULL if system-generated)
  subject_id   VARCHAR(36)   NULL,       -- affected user/student
  school_id    VARCHAR(36)   NULL,       -- tenant scope
  ip_address   VARCHAR(45)   NULL,
  metadata     JSON          NULL,       -- action-specific detail; PII stripped on user deletion
  created_at   DATETIME      NOT NULL DEFAULT CURRENT_TIMESTAMP,
  INDEX idx_user_id (user_id),
  INDEX idx_subject_id (subject_id),
  INDEX idx_action (action),
  INDEX idx_created_at (created_at)
);
```

**Immutability rule:** `audit_log` rows are never updated or deleted (except PII strip on GDPR deletion — only `metadata` JSON is cleared, row is retained). Any attempt to DELETE from `audit_log` triggers a DB-level alert via `audit_tampering_log`.

```sql
-- Trigger to record audit log tampering: logs each DELETE attempt but does not block it
CREATE TRIGGER audit_log_delete_guard
BEFORE DELETE ON audit_log
FOR EACH ROW
INSERT INTO audit_tampering_log (audit_log_id, attempted_at, attempted_by)
VALUES (OLD.id, NOW(), USER());
```
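The one permitted change to `audit_log`, clearing `metadata` when a subject is erased, could look roughly like the sketch below. `AuditLogRow` follows the schema above, while `stripAuditPii` is a hypothetical helper operating on in-memory rows; a real implementation would presumably be a single UPDATE scoped to the subject.

```typescript
// Mirrors the audit_log schema.
interface AuditLogRow {
  id: string;
  action: string;
  user_id: string | null;
  subject_id: string | null;
  school_id: string | null;
  ip_address: string | null;
  metadata: { [key: string]: unknown } | null;
  created_at: Date;
}

// Returns copies of the rows with metadata cleared for the erased subject.
// Rows for other subjects are returned unchanged, and no row is dropped.
function stripAuditPii(rows: AuditLogRow[], subjectId: string): AuditLogRow[] {
  return rows.map(row =>
    row.subject_id === subjectId ? { ...row, metadata: null } : row);
}
```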

---

## 8. GDPR — v1 School-First Model (Locked)

> **⚠️ v1 Locked (Sig, 2026-04-18):** Lawful basis is school contract / DPA. School is controller. Pickatale is processor. No per-student parental consent gate in the platform. No COPPA activation gate. Schools manage their own parental consent obligations outside the platform.

| Topic | v1 Rule | Legal basis |
|---|---|---|
| Student activation | Activated immediately under school DPA — no age gate, no parental consent required in platform | GDPR Art. 6(1)(b) — contract (school DPA) |
| Parental consent | School's obligation, managed outside Pickatale | Controller (school) duty — not processor (Pickatale) |
| Voice audio (fluency) | **Deleted immediately after transcription. Never stored.** | GDPR data minimisation — Art. 5(1)(c) |
| Touch / finger-tracking | Raw coordinates anonymised at session end — aggregated stats only retained | GDPR data minimisation — Art. 5(1)(c) |
| Right to deletion | 30 days — school or parent can request via teacher portal | GDPR Art. 17 |
| Data minimisation | Collect only what is educationally necessary | GDPR Art. 5(1)(c) |
| Breach notification | 72 hours to supervisory authority if personal data affected | GDPR Art. 33 |
| Cross-border (Ukraine server) | SCCs required for EU schools | GDPR Art. 46 — awaiting legal review |

**Note on COPPA:** Pickatale's primary market is EU/UK — GDPR governs. COPPA (US) is not the primary framework and does not drive platform architecture in v1. Where US schools use the platform, the school DPA covers data processing obligations. No COPPA-specific activation gate is built.

---

## 9. Future Model: Direct-to-Consumer / Parent-Controlled Consent

> **⚠️ Out of Scope for v1. Do not build against this.**

When Pickatale operates a direct-to-consumer (B2C) path where a parent creates a child account without a school:

- The **parent** becomes the Data Controller (not the school)
- Lawful basis = **explicit parental consent** (not school DPA)
- Pickatale's role changes from pure processor to joint controller
- A separate consent capture flow is required — in-platform, verifiable, auditable
- COPPA verifiable parental consent mechanisms apply (email + verification step)
- GDPR-K age gate applies (age of digital consent under GDPR Art. 8: 16 by default in the EU, which member states may lower to as low as 13; 13 in the UK)
- `consent_records` table is already designed to support this — `source = 'parent_portal_self_register'`
- Separate privacy policy and DPA template required

**What needs to be built when in scope:**
1. Parent self-registration flow with age gate check
2. In-platform parental consent capture (per child, per data type)
3. Verifiable consent mechanism (email confirmation minimum; phone/ID for COPPA strict compliance)
4. Separate lawful basis stored per child (`lawful_basis` field on `students` table)
5. Separate privacy policy presented at registration

This section exists as a placeholder only. No implementation until Sig confirms B2C is in scope.

---

## References

- [[summaries/admin-compliance]] — deployment, secrets, data retention table (references this page)
- [[concepts/data-model/Entitlement Model]] — entitlement fallback audit requirements
- [[concepts/user-flows/Parent Flow]] — parent portal data access UI
- [[entities/Account Center]] — DPA storage, consent records, audit log ownership
- [[summaries/meta-authority]] — open decisions register (DECISION-02: ownership transfer)
